The U1 antisense morpholino oligonucleotide (AMO) disrupts U1 snRNP structure to promote intronic PCPA modification of pre-mRNAs

Functional depletion of the U1 small nuclear ribonucleoprotein (snRNP) with a 25 nt U1 AMO (antisense morpholino oligonucleotide) may lead to intronic premature cleavage and polyadenylation of thousands of genes, a phenomenon known as U1 snRNP telescripting; however, the underlying mechanism remains elusive. In this study, we demonstrated that U1 AMO could disrupt U1 snRNP structure both in vitro and in vivo, thereby affecting the U1 snRNP–RNAP polymerase II interaction. By performing chromatin immunoprecipitation sequencing for phosphorylation of Ser2 and Ser5 of the C-terminal domain of RPB1, the largest subunit of RNAP polymerase II, we showed that transcription elongation was disturbed upon U1 AMO treatment, with a particular high phosphorylation of Ser2 signal at intronic cryptic polyadenylation sites (PASs). In addition, we showed that core 3′processing factors CPSF/CstF are involved in the processing of intronic cryptic PAS. Their recruitment accumulated toward cryptic PASs upon U1 AMO treatment, as indicated by chromatin immunoprecipitation sequencing and individual-nucleotide resolution CrossLinking and ImmunoPrecipitation sequencing analysis. Conclusively, our data suggest that disruption of U1 snRNP structure mediated by U1 AMO provides a key for understanding the U1 telescripting mechanism.

Functional depletion of the U1 small nuclear ribonucleoprotein (snRNP) with a 25 nt U1 AMO (antisense morpholino oligonucleotide) may lead to intronic premature cleavage and polyadenylation of thousands of genes, a phenomenon known as U1 snRNP telescripting; however, the underlying mechanism remains elusive. In this study, we demonstrated that U1 AMO could disrupt U1 snRNP structure both in vitro and in vivo, thereby affecting the U1 snRNP-RNAP polymerase II interaction. By performing chromatin immunoprecipitation sequencing for phosphorylation of Ser2 and Ser5 of the Cterminal domain of RPB1, the largest subunit of RNAP polymerase II, we showed that transcription elongation was disturbed upon U1 AMO treatment, with a particular high phosphorylation of Ser2 signal at intronic cryptic polyadenylation sites (PASs). In addition, we showed that core 3 0 processing factors CPSF/CstF are involved in the processing of intronic cryptic PAS. Their recruitment accumulated toward cryptic PASs upon U1 AMO treatment, as indicated by chromatin immunoprecipitation sequencing and individualnucleotide resolution CrossLinking and ImmunoPrecipitation sequencing analysis. Conclusively, our data suggest that disruption of U1 snRNP structure mediated by U1 AMO provides a key for understanding the U1 telescripting mechanism.
Notably, most conclusions are based on quantitative mass spectrometry (MS) analysis. In addition, although previous RNAPII chromatin immunoprecipitation (ChIP) sequencing analyses have suggested that U1 snRNP telescripting is correlated with transcription (3,5), the underlying mechanisms remain elusive.
Here, we investigated the effect of this U1 AMO on U1 snRNP and unexpectedly found that it could disrupt U1 snRNP structure both in vitro and in vivo. We further provided evidence that U1 AMO may directly impact the association of U1 snRNP and RNAPII, thereby impacting the transcription activity of the latter. This U1 AMO-induced premature transcription termination might cause premature cleavage and polyadenylation (PCPA). Overall, our data provide an up-todate understanding of U1 snRNP telescripting mechanism.

U1 AMO may affect U1 snRNP structural integrity in vitro
We hypothesized that the 25 nt U1 AMO, routinely used in previous experiments for functional knockdown of U1 snRNA, might disrupt U1 snRNP structural integrity based on two observations. First, in the nuclear extract (NE) prepared from HeLa cells, coimmunoprecipitation (co-IP) analysis using anti-U1C antibody showed that U1-70K and U1A, two other U1 snRNP-specific proteins, were significantly downregulated in the IPed sample using U1 AMO-treated NE (9). Second, it is well established that U1-70K can bind U1 snRNA stem loop 1 (SL1) (Fig. 1A) (10)(11)(12), which is close to the region where 25 nt U1 AMO binds, and U1C was also reported to be associated with U1 snRNA at the 5 0 end within U1 snRNP complex (13). Therefore, U1 AMO might interfere with the associations of U1 snRNA and U1-70K-U1C.
We first performed a similar co-IP experiment to test the aforementioned hypothesis. Instead of an anti-U1C antibody, we used an anti-U1A antibody. Consistent with earlier findings (9), we observed a mild decrease in U1-70K protein and an apparent reduction in U1-C protein in the IPed products for U1 AMO-treated samples (Fig. 1B). The results suggest that U1 AMO might interfere with protein-protein interactions within the U1 snRNP complex in the NE. To test if U1 AMO could affect protein-RNA interactions within the U1 snRNP complex, we took advantage of the knowledge that U1-70K and U1A could directly bind U1 snRNA and performed a gel mobility shift assay in the presence of control and U1 AMO. Recombinant glutathione-S-transferase (GST)-U1-70K/U1A/ U1C proteins were prepared using the constructs (gift from Dr Marc-Étienne Huot, Université Laval; confirmed by Sanger sequencing) characterized in previous work (14). As expected, recombinant GST-U1-70K and U1A proteins formed stable protein-RNA complexes with U1 snRNA in the presence of control AMO (Fig. S1, A and B). However, we found that U1 AMO could efficiently block U1-70K binding of U1 snRNA even at low concentration (0.2 μM), wherein U1-70K-U1 snRNA was near a 1:1 ratio (Fig. 1C). To further substantiate that the supershifted band corresponds to U1-70K-U1 snRNA complex, the SL1 deletion mutant U1 snRNA was prepared and subsequently used for the same gel mobility shift assay. Indeed, a sharp decrease in the signal of the supershifted band was observed (Fig. S1C). In contrast, the protein-RNA interaction of GST-U1A and U1 snRNA was unaffected in the tested conditions. We predicted that U1 AMO might block U1-70K binding through steric hindrance. We employed a DNA oligo harboring the same sequence with 25 nt U1 AMO and performed the aforementioned gel shift assay to test this. Indeed, similar to AMO, 25 nt antisense DNA oligo could also efficiently block U1-70K-U1 snRNA association (Fig. 1D). The results motivated us to subsequently test the blocking efficiency of a series of truncated forms of U1 antisense DNA. As expected, the shortening of the 5 0 end of antisense DNA gradually alleviated the inhibitory effect, which coincided with the fact that the binding region of the 5 0 end of antisense DNA was close to U1 snRNA SL1; however, the 3 0 end was unaffected (Figs 1D and S1D).
To consolidate the aforementioned steric hindrance model, we applied a previously reported biotin-based antisense oligo pull-down method to enrich the U1 snRNP complex from HeLa NE (15). The 11 nt biotin-labeled DNA, which is Figure 1. Effect of U1 AMO on U1 snRNP structure in vitro. A, schematic representation of components of U1 snRNP complex. B, coimmunoprecipitation (IP) analysis using antibodies against control immunoglobulin G (IgG) or U1A in the presence of control or U1 AMO followed by Western blotting analysis (for detection of U1-70K, U1A, and U1C) and Northern blotting analysis (for detection of U1 snRNA). C, gel mobility shift assays using detectable amount of radiolabeled U1 snRNA (about 0.2 μM) and recombinant GST-U1-70K (5 μM) or GST-U1A (5 μM) protein in the presence of control or U1 AMO (0, 0.02, 0.2, and 2 μM). The protein-RNA complex, free RNA, and degraded RNA products are indicated with arrows. D, gel mobility shift assays using detectable amount of radiolabeled U1 snRNA (about 0.2 μM) and recombinant GST-U1-70K protein (5 μM) in the presence of U1 AMO or DNA oligos (2 μM). The protein-RNA complex, free RNA, and degraded RNA products are indicated with arrows. The sequences of the DNA oligonucleotides are listed in the left panel. E, DNAbiotin-based pull-down assays using the indicated oligos and HeLa NE followed by Western blotting analysis (for detection of U1-70K, U1A, and U1C) and Northern blotting analysis (for detection of U1 snRNA). The sequence of the biotin-labeled DNA oligonucleotides is listed in the upper panel. AMO, antisense morpholino oligonucleotide; GST, glutathione-S-transferase; NE, nuclear extract; snRNP, small nuclear ribonucleoprotein.
complementary to the free 5 0 end of U1 snRNA, could enrich significant amount of U1-70K and U1C in the pull-down assay (Fig. 1E). In contrast, the 25 nt biotin-labeled DNA pull-down sample returned undetectable U1-70K and U1C proteins. Northern blotting analysis of U1 snRNA demonstrated that both biotin-DNA oligos could enrich U1 snRNA with similar efficiency (Fig. 1E), excluding the possibility that the observed difference resulted from differential U1 snRNA pull-down efficiency (Fig. 1E).

U1 AMO may affect U1 snRNP structural integrity in vivo and impact RNAPII activity
To investigate if U1 AMO could affect U1 snRNP structure in vivo, we performed control and U1 AMO transfection in HeLa cells and subsequently prepared NE for the same aformentioned co-IP analysis. It appeared that the U1 inhibitory effect is more pronounced than the results obtained in the aforementioned experiment (Figs. 1B and S2A). This was partly because the nucleus contains less U1-70K and U1C protein upon U1 AMO transfection. To confirm this, we examined their expression in the whole cells, nucleus, and cytoplasm, respectively. The results demonstrated that U1 AMO treatment consistently led to an apparent accumulation of U1-70K-U1C proteins in the cytoplasm and a corresponding downregulation of U1-70K-U1C in the nucleus ( Fig. 2A). In contrast, their overall protein expression was not much disturbed. It must be noted that the observed difference was not caused by potential crosscontamination between cytoplasm and nucleus extracts, as demonstrated by the expression pattern of several nuclear and cytoplasmic protein markers, including lamin B1 and Erp29. In addition, U1 AMO did not appear to cause the U1 snRNA accumulation in the cytoplasm, as revealed by the Northern blotting analysis ( Fig. 2A).
To complement the results observed in the Western blotting analysis, we performed immunostaining for U1C. An apparent immunostaining signal was observed in the cytoplasm for U1 AMO-treated cells, whereas it was exclusively localized in the nucleus for control cells (Fig. 2B). Further quantification of U1C export efficiency by calculating the ratio of intensity signal in the cytoplasm and whole cells confirmed the significance of the change (Fig. 2B). These results suggested that U1 AMO transfection significantly affected the overall U1 snRNP structural integrity in vivo. We also carried out the aforementioned experiments in another human cell line SW480, and the results were the same obtained in HeLa cells (Fig. S2, B and C). Overall, we showed that U1 AMO could disrupt U1 snRNP structure in vivo, arguing against the prevalent U1-CPAF model that U1 snRNP structure is not perturbed upon U1 AMO transfection (6).
Recent studies have shown that U1 snRNP associates with RNAPII (9, 16-18); we next investigated if this association is disrupted in U1 AMO-treated cells. Using a well-established antibody (8wg16) targeting the CTD (C-terminal domain) of RPB1, the largest subunit of RNAPII, our co-IP analysis confirmed that RNAPII may interact with U1C in the control HeLa cells (Fig. 2C). This interaction might be independent of nucleic acids, gene transcription, and FUS protein, as shown by the similar results under the treatment of benzonase, actinomycin D, or siRNAs targeting FUS protein (Fig. S2D). As expected, this interaction was significantly disrupted in U1 AMO-treated HeLa cells (Fig. 2C).
We further asked if U1 AMO has a direct impact on RNAPII transcription activity. To examine this, we first utilized an in vitro reconstitution system previously applied to characterize the structure of U1-snRNP-RNAPII (19). In this system, a synthesized pre-mRNA was hybridized to bulged DNA template, mimicking the gene transcription process (Fig. S2A). Instead of using purified U1 snRNP and RNAPII components, we used HeLa NE for in vitro assay. Indeed, by adopting the well-established MS2-tagged RNA affinity purification method (19,20), U1 snRNP and RPB1 could be detected in the pulldown sample (Fig. S3A). Mutation of U1 snRNA binding site apparently decreased the binding of both U1 snRNP and RPB1, indicating that U1 snRNP contributes to the association of RPB1 with DNA-pre-mRNA. As expected, little U1 snRNP proteins were detected in the presence of U1 AMO. Significantly, decreased pull-down efficiency was observed for RPB1 when U1 AMO was added in the NE compared with the control AMO, as evidenced by Western blotting analysis (Fig. S3A). These results indicated that RPB1 tended to fall off DNA-pre-mRNA hybrid in the presence of U1 AMO. To further substantiate this point, pre-mRNA was radiolabeled in the aforementioned reaction mixture (containing both DNA and wildtype pre-mRNA), and RNA-IP analysis was carried out using RPB1 antibody. Similar as U1A, RPB1 showed a significantly weaker association with pre-mRNA upon U1 AMO treatment (Fig. S3B).
We subsequently investigated the impact of U1 AMO on two RNAPII CTD modifications, namely Ser5P and Ser2P, by ChIP-Seq analysis (two biological replicates for each sample), given their direct links with active gene transcription in vivo (21)(22)(23). Consistent with the previous RNAPII ChIP-Seq analysis (3), upon U1 AMO treatment, a progressive decrease in RNAPII Ser5P and Ser2P signal was observed for all the actively expressed genes (Figs. 2D and S4A; Table S1). This result is in line with the observation that detectable decreases were observed for Ser5P and Ser2P in chromatin fraction upon U1 AMO transfection (Fig. S4B). To gain insight into U1 snRNP telescripting, we divided the actively expressed genes into intronic PCPAed genes and nonintronic PCPAed genes, using our recently published 3 0 -seq data and Quanti-fyPoly(A) pipeline (24,25). In total, we found that the usage of 21,248 intronic PASs distributed among 5937 genes was significantly elevated upon U1 AMO treatment (Log2 [fold change] >1; p < 0.05) (Table S2). Therefore, intronic PCPAed genes hereafter refer to these 5937 genes, and nonintronic PCPAed genes refer to other 7280 genes (in total 13,217 genes), if not indicated otherwise. Indeed, an apparent difference between the two subgroups was observed (Fig. 2D). For intronic PCPAed genes, a sharp increase in RNAPII Ser2P signal, together with an apparent increase in RNAPII signal (Fig. S4A), was observed between the TSS and intronic PAS, Figure 2. Effect of U1 AMO on U1 snRNP structure in vivo. All the experiments were performed using HeLa cells. A, Western blotting analysis and Northern blotting analysis (for detection of U1 snRNA) to examine the subcellular distribution of U1 snRNP components and indicated protein markers upon control or U1 AMO transfection. Ctr represents control AMO-treated cells, and U1 represents U1 AMO-treated cells. B, immunostaining of U1C, GAPDH, and CFIm25 proteins in control and U1 AMO-treated cells (Ctr represents control AMO, and U1 represents U1 AMO). The scale bar of 10 μm is shown in the right bottom of the picture. Graphs show the percentages of exported efficiencies (cytoplasm/total immunostaining signal) quantified using ImageJ software. The results are from 20 different randomly selected cells from three independent microscope images. Data are shown as mean ± SEM. Student's t test was performed to examine the significance of the difference. *p < 0.05; ns, nonsignificant change. C, coimmunoprecipitation (IP) analysis using control immunoglobulin G (IgG) or 8wg16 antibodies in the presence of control or U1 AMO followed by Western blotting analysis to detect the indicated proteins. The input is 1.5% of the total lysates. D, metagene plots of RNAPII Ser2P/Ser5P and PCF11 chromatin immunoprecipitation (ChIP-Seq) reads in control and U1 AMO-treated HeLa cells for all actively expressed genes (n = 13,217), intronic PCPAed genes (n = 5937), and other nonintronic PCPAed genes (n = 7280). For intronic PCPAed genes, a second metagene plot was made for each ChIP-Seq by replacing the default TES (transcription end site) with intronic PCPA sites. The arrows are used to emphasize the difference of the ChIP-Seq profile near TES between the two groups of genes. Control represents control AMOtreated cells, and U1 represents U1 AMO-treated cells. PCPA, premature cleavage and polyadenylation; RNAPII, RNA polymerase II.
implicating a scenario that RNAPII movement tended to pause toward intronic PAS, thereby providing a time window for intronic PAS processing. For those non-PCPAed genes, RNAPII directly fell off transcribing region in U1 AMOtreated cells (Figs. 2D and S4A). In addition, we performed ChIP-Seq analysis for PCF11, a transcription terminationassociated factor (26,27), and a similar discrepancy was observed for the two groups of genes (Figs. 2D and S4C). Fig. S4D shows each group's ChIP-Seq(s) profiles for several representative genes.
Core 3 0 processing factors CPSF/CstF play roles in intronic PAS processing Furthermore, we wished to identify 3 0 processing machinery associated with the processing of intronic PASs. Although it is predicted that intronic PASs and canonical 3 0 UTR PASs share common 3 0 processing machinery based on common PAS RNA cis-elements (Fig. S5, A and B) (24, 28), more direct experimental evidence still needs to be provided. To this goal, we adopted a similar RNA affinity approach previously used to purify the canonical PAS processing complex (20,29). Briefly, three copies of MS2 hairpin were fused to the intronic PAS of the nr3c1 gene (Fig. 3A), which is a bona fide PCPA target upon U1 AMO treatment (Fig. S4D). To make a negativecontrol RNA substrate, the core hexamer AAUAAA within PAS was mutated to AACAAA. The RNA substrate was incubated with HeLa NE, and the mRNA 3 0 processing complex was purified using an MS2 tag. After RNA pull-down, silver staining and quantitative protein analysis by MS were performed to estimate the abundance of 3 0 processing factors. Indeed, MS analysis followed by Western blotting validation indicated that most of the top hits were known CPSF/CstF factors (Fig. 3, B and C; Table S3). By mutating the CPSF/CstF binding sites of several intronic PASs, we further confirmed that CPSF/CstF play positive roles in processing of the intronic PAS using in vitro mRNA 3 0 ' processing assays and a luciferase reporter assay that was reported previously (Fig. S5, C and D) (30). Our results therefore provided more direct experimental Figure 3. Effect of U1 AMO on the recruitment of core intronic PAS processing factors onto chromatin and pre-mRNA. A, in vitro cleavage/ polyadenylation and in vitro cleavage assays using HeLa NE and RNA substrates derived from intronic PAS of nr3c1 gene. Three repeats of MS2 sequences were added to the 5 0 end of the PAS RNA to facilitate the process of protein purification. As negative control, a mutant PAS RNA harboring a point mutation in the core hexamer AAUAAA was prepared simultaneously. The PAS RNA sequences are listed in Table S4. The polyadenylated or cleaved RNA products are indicated with red arrows. B, silver staining of protein complexes purified from HeLa NE using the indicated PAS RNAs. Potential intronic PAS processing factors, which are indicated with red arrows, could be purified by wt but not m1 PAS RNAs. C, validation of mass spectrometric results by Western blotting analysis using antibodies against indicated proteins. The input is 2% of the total lysates. D, metagene plots of CPSF160, WDR33, and CSTF77 chromatin immunoprecipitation (ChIP-Seq) reads in control and U1 AMO-treated HeLa cells for all actively expressed genes (n = 13,217), intronic PCPAed genes (n = 5937), and other nonintronic PCPAed genes (n = 7280). Control represents control AMO-treated HeLa cells, and U1 represents U1 AMO-treated HeLa cells. E, metagene plots of WDR33 and CSTF64 iCLIP-Seq reads in control and U1 AMO-treated HeLa cells for all actively expressed genes (n = 13,217), intronic PCPAed genes (n = 5937), and other nonintronic PCPAed genes (n = 7280). For intronic PCPAed genes, a second metagene plot was made for each ChIP-Seq/iCLIP-Seq by replacing the default TES (transcription end site) with intronic PCPA sites. Control represents control AMO-treated HeLa cells, and U1 represents U1 AMO-treated HeLa cells. F, a schematic diagram summarizing the key finding in our study. The details are described in the main body of the text. iCLIP-Seq, individual-nucleotide resolution CrossLinking and ImmunoPrecipitation sequencing; PCPA, premature cleavage and polyadenylation. evidence that known core 3 0 processing factors CPSF/CstF were involved in intronic PAS processing upon U1 AMO treatment.

U1 AMO treatment manipulates CPSF/CstF cotranscriptionally
We hypothesized that CPSF/CstF associations with transcribed genes might be manipulated under the U1 AMO treatment condition. ChIP-Seq(s) were performed to test most of known core CPSF/CstF factors (two biological replicates for each sample). Consistent with the current cotranscriptional recruitment model of 3 0 processing factors (31, 32), a sharp peak at TSS in the ChIP-SSeq profile for WDR33, CPSF30, CPSF100, FIP1, and CstF77 in control cells was detected. In contrast, another major peak was observed near transcription end site only for WDR33, FIP1, and CstF77 (Figs. 3D and S6A), coinciding with their roles in canonical PAS processing. Overall, we found that most of the ChIP-Seq profiles investigated for CPSF/CstF factors showed detectable changes in U1 AMO-treated cells (Figs. 3D and S6A). Among them, CPSF160 showed the most striking change, and it appeared that U1 AMO activated CPSF160 association with genomic regions of both intronic PCPAed and remaining actively transcribed genes. Wdr33, CPSF100, and CstF77 also showed mildly elevated signals across the gene body for all the actively expressed genes. However, for CPSF30 and FIP1, a slight increase in binding signal was only observed for intronic PCPAed genes, whereas the opposite trend was observed for non-PCPAed genes.
We subsequently mapped protein-RNA interactions in vivo by individual-nucleotide resolution CrossLinking and Immu-noPrecipitation sequencing (iCLIP-Seq) for two canonical CPSF/CstF factors, WDR33 and CstF64, before and after U1 AMO treatment (two biological replicates for each sample). Our iCLIP-Seq(s) were reliable based on the specific binding patterns and enriched motifs (Figs. 3E and S6B) (30,33,34). Two major observations were made from the iCLIP-Seq analysis between the control and U1 AMO-treated cells. First, for intronic PCPAed genes, an elevated iCLIP-Seq signal was detected across the gene body for both WDR33 and CstF64 upon U1 AMO treatment (Fig. 3E), consistent with the results obtained from ChIP-Seq data and suggested the U1 AMO enhanced CPSF/CstF associations with DNA-pre-mRNA cotranscriptionally. Second, for non-PCPAed genes, the change in iCLIP-Seq signal value was not as apparent as that in PCPAed genes (Fig. 3E), consistent with the results from bioinformatics analysis that non-PCPAed transcripts tend to be short and have less PAS motifs recognized by CPSF/ CstF (Fig. S5B). Fig. S6C shows several representative examples illustrated by snapshots of mapped reads using the IGV software (Broad Institute).

Discussion
To summarize this study, a schematic model has been proposed in Figure 3F. In addition to splicing, U1 snRNP could be a key player in sustaining transcription elongation by associating with RNAPII in control cells. Although CPSF/CstF could be loaded onto chromatin during the transcription initiation step, they are not fully assembled onto pre-mRNA 3 0 UTR PAS until RNAPII reaches the end of transcription unit, wherein RNAPII CTD Ser2 is heavily phosphorylated. In U1 AMO-treated cells, the integral U1 snRNP complex might fall apart (some of the U1-specific proteins could be exported to the cytoplasm), which affected its association with RNAPII and ultimately affected transcription elongation. Based on the well-accepted "first come, first served" cotranscriptional mRNA 3 0 processing model (35,36), the CPSF/CstF complex could be fully assembled onto cryptic PASs of intronic PCPAed genes once U1 AMO-triggered premature transcription termination occurs. The intronic mRNA 3 0 processing, RNAPII pausing, and RNAPII CTD Ser2P could be reciprocally regulated near intronic PAS (37)(38)(39). In contrast, nonintronic PCPAed genes tend to be short and have less intronic PAS, and their 3 0 end formation takes place soon after the 3 0 UTR PAS is transcribed.
To elucidate the U1 snRNP telescripting mechanism, the previous U1-CPAF activation model has focused on the compositions and functions of U1 snRNP-associated factors (4)(5)(6)(7)(8). Unlike this model, we found that the 25 nt U1 AMO could significantly affect the protein-protein interaction and protein-RNA interaction within the U1 snRNP complex, both in vitro and in vivo. Furthermore, we provided evidence that the activation of CPAF upon U1 AMO treatment is associated with RNAPII pausing and RNAPII CTD Ser2P. Although CFIm68 may be involved in this process, as suggested by U1-CPAF model (6), RNAPII CTD and Ser2P might also be involved, given that RNAPII CTD and Ser2P play key roles in cotranscriptional mRNA 3 0 processing (37)(38)(39). In addition, we showed that the movement of core 3 0 processing factors CPSF/ CstF across gene body is globally modified upon U1 AMO transfection, providing additional insight into the mechanism of U1 snRNP telescripting.
Our study has also raised several interesting questions for future investigation. First, consistent with current cotranscriptional mRNA 3 0 end processing model (32,(37)(38)(39), RNAPII pausing and high RNAPII Ser2P density were observed near the intronic PAS for those intronic PCPAed genes upon U1 AMO treatment, whereas the trend was not apparent for non-PCPAed genes (Fig. 2D), indicating that Ser2P and 3 0 end processing might be uncoupled during the process of transcription termination for these non-PCPAed genes under U1 AMO condition, which adds a layer of complexity to the regulatory mechanism of cotranscriptional mRNA 3 0 end processing. Second, although our study and previous studies have suggested that the U1 snRNP and RNAPII might have physical interactions (Fig. 2C) (9,(16)(17)(18); it remains unclear if this association is direct or indirect, particularly in a cellular context, which may enable us to understand how U1 AMO caused the differential RNAPII activity.
Finally, although we have provided data that U1 AMO disrupts U1 snRNP structure and affects transcription elongation, we still do not know how U1 protects against intronic PAS usage, and whether it associates with the role of U1 in splicing needs to be further investigated, given the intimate links between 3 0 processing, splicing, and transcription regulation (32).

Experimental procedures
Cell culture and oligonucleotide/plasmid transfection HeLa cells were cultured in Dulbecco's modified Eagle's medium supplemented with 10% fetal bovine serum. AMO transfections were performed as previously described. The sequence of the 25-mer U1 AMO is 5ʹ-GGTATCTCCC CTGCCAGGTAAGTAT-3ʹ. The sequence of control AMO is 5ʹ-CCTCTTACCTCAGTTACAATTTATA-3ʹ. AMOs were ordered from Gene Tools. Transfection of luciferase reporter plasmid pPASPORT constructs was carried out using Lipofectamine 2000 (Thermofisher) according to the user manual. Transfection of siRNAs was carried out using Lipofectamine RNAimax (Thermofisher). The siRNA target sequences are listed in Table S4.
MS2-tagged RNA affinity purification of mRNA 3 0 processing complex The RNA affinity purification of mRNA 3 0 processing complex was carried out essentially as described before (20,29). Protein pellet was analyzed by MS to identify 3 0 processing factors or by SDS-PAGE and silver staining or Western blotting analyses. Similar MS2-tagged RNA affinity coupled with Western blotting analysis was performed, as provided in Fig. S3.

IP
Protein IPs were carried out using standard procedures recommended by antibody supplier Abcam. For RNA IPs, RNAs were extracted by Trizol reagents (Thermofisher), Northern blot analysis of U1 snRNA was performed using 5ʹradiolabeled DNA probes (sequence is listed in Table S4) in ultrasensitive hybridization buffer (Thermofisher). The radioactivity signals were analyzed by PhosphorImager (Typhoon FLA 7000).

DNA-biotin based pull-down assay
Biotinylated DNA oligos were ordered from SYNBIO technologies (sequences are listed in Table S4). Biotinylated DNAs were first bound to the streptavidin beads and then were incubated with HeLa NE in the polyadenylation conditions for 5 min. After biotin-streptavidin binding and washing, pull-down samples were heated at 95 C for 5 min in 1× SDS-PAGE sample loading buffer. The eluted samples were further subject to Western blotting analysis.

In vitro mRNA 3ʹ processing assays
In vitro cleavage/polyadenylation and in vitro cleavage assays were performed using P 32 -radiolabeled RNA sequences and HeLa NEs following the standard protocol as described elsewhere (20,30,40). HeLa NEs were purchased from IPRACELL.

Luciferase reporter assay
HeLa cells were harvested after 24 h of transfection with pPASPORT plasmids. Luciferase activity was measured using Promega Dual-Luciferase Reporter kit and Berthold Sirius detection system.

Nuclear/cytoplasm extraction
A Nuclear/Cytosol Extraction Kit (Abcam) was used to prepare nuclear/cytoplasm proteins/RNAs for Western blotting analysis.

Western blotting
The primary antibodies for 3 0 processing factors were purchased from Bethyl Laboratories. Other primary antibodies were purchased from Abcam or Santa Cruz Biotechnology (catalog number available on request). For secondary antibodies, we used horseradish peroxidase-conjugated antimouse/rabbit (Sigma). ECL western blotting system (Thermofisher) was used to detect the signals.
ChIP-Seq libraries were prepared using the ChIP-IT Express Enzymatic Shearing (Active Motif) and ChIP-Seq library preparation (Vazyme) kits. Primary antibodies were purchased from Abcam or Santa Cruz Biotechnology. These libraries were sequenced on the NovaSeq platform, and all ChIP-Seq data processing and analysis were performed according to the ENCODE ChIP-Seq pipeline (https://www.encodeproject. org/data-standards/chip-seq/). iCLIP-Seq library preparation and bioinformatics analysis were conducted as previously described (30,41). The primary antibodies (WDR33 [A301-152A] and CstF64 [A301-092A]) were purchased from Bethyl.

Data availability
All the deep sequencing data have been deposited to Gene Expression Omnibus database with the accession no. GSE192943. The MS proteomics data have been deposited to the ProteomeXchange Consortium with the dataset identifier PXD040405.
Supporting information-This article contains supporting information.
Author contributions-R. Y. and A. P. X. methodology; Z. L. and Congting Y. formal analysis; Q. F., Y. R., and Y. D. investigation; Chengguo Y. data curation; Z. L. and Chengguo Y. writing-original draft; Chengguo Y. supervision. Conflict of interest-The authors declare that they have no conflicts of interest with the contents of this article.